Smoothing-Evaluation Method in Delayed Reinforcement Learning

Authors

  • Katsunari Shibata
  • Yoichi Okabe
Abstract

Another method of delayed reinforcement learning is proposed. The robot's brain contains two neural networks: an evaluation network and a motion network. The evaluation network is trained to reduce the absolute value of the second-order time derivative of its own output while the robot moves. This learning yields an evaluation that reflects the time the robot needs to reach a target, and the robot's route can be optimized under some conditions. In a simulation, a robot with an asymmetric motion characteristic reached a target along an almost optimal route, which was better than the route in a comparison simulation where the evaluation function was given by the authors and only the motion was learned.
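The smoothing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain array of evaluation values along one trajectory stands in for the evaluation network, the endpoint values (0 at the start, 1 at the target) are assumed to be held fixed, and a simple local relaxation step replaces network training. The names `smooth_evaluation`, `step`, and `sweeps` are hypothetical. Under these assumptions, driving the second-order time difference toward zero leaves an evaluation that changes linearly with time, so it encodes the remaining time to the target.

```python
import numpy as np


def second_difference(v):
    """Discrete second-order time difference v[t-1] - 2*v[t] + v[t+1]."""
    return v[:-2] - 2.0 * v[1:-1] + v[2:]


def smooth_evaluation(values, step=0.25, sweeps=2000):
    """Relax interior evaluation values so their second difference shrinks.

    Assumption: the first and last values are fixed boundary conditions
    (start state and target state); only interior values are adjusted.
    """
    v = np.asarray(values, dtype=float).copy()
    for _ in range(sweeps):
        d2 = second_difference(v)   # computed from the current values
        v[1:-1] += step * d2        # move each interior value toward its neighbours' mean
    return v


# Hypothetical trajectory of 11 time steps with random initial evaluations.
rng = np.random.default_rng(0)
initial = rng.uniform(0.0, 1.0, size=11)
initial[0], initial[-1] = 0.0, 1.0  # assumed fixed: start = 0, target = 1

smoothed = smooth_evaluation(initial)
max_d2 = float(np.max(np.abs(second_difference(smoothed))))
```

After relaxation, `smoothed` is close to a straight ramp from 0 to 1, so each value is proportional to the elapsed fraction of the trip. A motion network could then, in principle, be trained to choose moves that increase this value.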


Similar resources

Action Refinement in Reinforcement Learning by Probability Smoothing

In many reinforcement learning applications, the set of possible actions can be partitioned by the programmer into subsets of similar actions. This paper presents a technique for exploiting this form of prior information to speed up model-based reinforcement learning. We call it an action refinement method, because it treats each subset of similar actions as a single “abstract” action early in ...


RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features

Principal aim of a search engine is to provide the sorted results according to user’s requirements. To achieve this aim, it employs ranking methods to rank the web documents based on their significance and relevance to user query. The novelty of this paper is to provide user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...


Web pages ranking algorithm based on reinforcement learning and user feedback

The main challenge of a search engine is ranking web documents to provide the best response to a user's query. Despite the huge number of the extracted results for user's query, only a small number of the first results are examined by users; therefore, the insertion of the related results in the first ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement le...


Multigrid Q-learning

Reinforcement learning scales poorly when reinforcements are delayed. The problem of propagating information from delayed reinforcements to the states and actions that have an effect on the reinforcement is similar to the problem of propagating information in a discretized boundary value problem. Multigrid methods have been shown to decrease the number of updates required to solve boundary value pr...


An Investigation of Reinforcement Learning for Reactive Search Optimization

Reactive Search Optimization advocates the adoption of learning mechanisms as an integral part of a heuristic optimization scheme. This work studies reinforcement learning methods for the online tuning of parameters in stochastic local search algorithms. In particular, the reactive tuning is obtained by learning a (near-)optimal policy in a Markov decision process where the states summarize rel...




Publication date: 2007